Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

Authors

A. Sayadianii, Department of Product and Services, Tamin Telecom Co. (3G mobile operator), Tehran, Iran

M. Eslami, Corresponding Author, Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran

Abstract:

This article examines methods for improving the performance of GMM-based voice conversion systems, taking the GMM method as the baseline. Current methods use a single conversion function for all speech units, and the statistical averaging involved smooths the converted spectrum, which reduces quality. In this paper, after introducing the GMM2 method, several GMM models are used to model each phoneme. Furthermore, when matching the clusters of each state, an LMR conversion is applied before the Dynamic Time Warping algorithm to bring the parameters of the two speakers' corresponding states into closer correspondence. Another cause of quality reduction in voice conversion systems is that the speech signal parameters are estimated with insufficient precision. To overcome this problem, the Generalized Harmonic Model is introduced to replace the sinusoidal harmonic model used in GMM2, giving another method called GMM3. Finally, we present the GMM4 method, whose objective is to improve system performance when only limited data and a restricted number of demi-syllables are available to train the conversion functions.
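The abstract mentions Dynamic Time Warping as the step that aligns corresponding states of the two speakers. As a rough illustration only, the following is a minimal sketch of textbook DTW with a Euclidean frame distance. The function name `dtw_align` and the toy feature vectors are assumptions made for this example; it does not include the LMR pre-conversion and is not the authors' implementation.

```python
def dtw_align(source, target):
    """Align two sequences of feature vectors with plain DTW.

    Returns (total_cost, path), where path is a list of
    (source_index, target_index) pairs from start to end.
    """
    def dist(a, b):
        # Euclidean distance between two frames (an assumed choice).
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(source[i - 1], target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat target frame
                                 cost[i][j - 1],      # repeat source frame
                                 cost[i - 1][j - 1])  # advance both

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy one-dimensional "frames": the target repeats its middle frame.
src = [[0.0], [1.0], [2.0]]
tgt = [[0.0], [1.0], [1.0], [2.0]]
total, path = dtw_align(src, tgt)
```

In this toy case the path pairs the single middle source frame with both repeated target frames, which is the kind of frame-level pairing a conversion function would then be trained on.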

Similar resources

A Statistical Sample-Based Approach to GMM-Based Voice Conversion Using Tied-Covariance Acoustic Models

This paper presents a novel statistical sample-based approach for Gaussian Mixture Model (GMM)-based Voice Conversion (VC). Although GMM-based VC has the promising flexibility of model adaptation, quality in converted speech is significantly worse than that of natural speech. This paper addresses the problem of inaccurate modeling, which is one of the main reasons causing the quality degradatio...

On the Quality Improvement of Voice Conversion Systems Based on GMM Model

In a voice conversion system, the speech signal of a speaker (i.e. source speaker) is modified so that it sounds as if it had been pronounced by another speaker (i.e. target speaker). This process is sometimes called speaker conversion (changing speaker identity). The signal produced by a speaker conversion system should be of high quality and sound very natural. To satisfy this, three major methods are pro...

Eigenvoice-based Approach to Voice Conversion and Voice Quality Control

This paper reviews our proposed approach to voice conversion (VC) and voice quality control based on an eigenvoice technique. VC is a technique to modify nonlinguistic information such as speaker individuality while keeping linguistic information unchanged. In the traditional VC framework, a conversion model for a source and target speaker-pair needs to be trained in advance using a parallel da...

On the Relationship Between Using Discourse Markers and the Quality of Expository and Argumentative Academic Writing of Iranian English Majors

The aim of the present study was to investigate the frequency and the type of discourse markers used in the argumentative and expository writings of Iranian EFL learners and the differences between these text features in the two essay genres. The study also aimed at examining the influence of the use of discourse markers on the participants’ writing quality. To this end the discourse markers us...

Probabilistic Voice Conversion Using Gaussian Mixture Models

This paper explores the topic of voice conversion as explored in a joint project with Percy Liang (EECS, Berkeley). For our purposes, voice conversion is the process of modifying the speech signal of one speaker (source) so that it sounds as though it had been pronounced by a different speaker (target). By using a Gaussian mixture model (GMM) to model the features of the source speaker, we can...

Enhancement of Esophageal Speech Using Statistical Voice Conversion

This paper presents a novel method of enhancing esophageal speech based on statistical voice conversion. Esophageal speech is one of the speaking methods for total laryngectomees. Although it allows laryngectomees to speak by generating a sound source and articulating it to produce audible speech sounds using their esophagus and vocal organs, the generated voices sound unnatural. To improve the...

Journal title

International Journal of Modeling, Identification, Simulation and Control

Volume 43, Issue 2

Pages 11-17

Publication date 2011-11-01

Keywords

High quality voice conversion; Gaussian mixture model (GMM); Generalized Harmonic Model (GHM); spectral conversion
